Contrasting Machine Learning Approaches for Microtext Classification
Abstract
The goal is the classification of microtext: identifying lines of military chat, or posts, that contain items of interest. This paper evaluates non-linear statistical data modeling techniques and compares them with our previous results using several text categorization and feature selection methodologies. The chat posts are examples of 'microtext': text that is generally very short, semi-structured, and characterized by unstructured or informal grammar and language. These three attributes lead to results that differ from those obtained on traditional long-form free text. In this paper, we further characterize microtext. Highly accurate classification of microtext entries is crucial for facilitating more complex information extraction. Although this study focused specifically on tactical updates via chat, we believe the findings apply to content with a similar linguistic structure regardless of domain. This includes other microtext sources such as IM/XMPP, SMS, voice transcriptions, and micro-blogging services such as Twitter(tm).
Similar resources
Learning Ontologies from the Web for Microtext Processing
We build a mechanism to form an ontology of entities that improves the relevance of matching and searching microtext. Ontology construction starts from the seed entities and mines the web for new entities associated with them. To form these new entities, machine learning of syntactic parse trees (syntactic generalization) is applied to form commonalities between various search results for existi...
A Microtext Corpus for Persuasion Detection in Dialog
Automatic detection of persuasion is essential for machine interaction on the social web. To facilitate automated persuasion detection, we present a novel microtext corpus derived from hostage negotiation transcripts as well as a detailed manual (codebook) for persuasion annotation. Our corpus, called the NPS Persuasion Corpus, consists of 37 transcripts from four sets of hostage negotiation tr...
Image Classification via Sparse Representation and Subspace Alignment
Image representation is a crucial problem in image processing, where there exist many low-level representations of images, e.g., SIFT, HOG and so on. But there is a missing link between low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation and principal component analysis are employed to d...
Prostate cancer radiomics: A study on IMRT response prediction based on MR image features and machine learning approaches
Introduction: To develop different radiomic models based on radiomic features and machine learning methods to predict early intensity modulated radiation therapy (IMRT) response. Materials and Methods: Thirty prostate patients were included. All patients underwent pre- and post-IMRT T2-weighted and apparent diffusion coefficient (ADC) magnetic resonance imagi...
Fault Detection of Anti-friction Bearing using Ensemble Machine Learning Methods
The anti-friction bearing (AFB) is a very important machine component, and its unscheduled failure causes malfunctions in a wide range of rotating machinery, resulting in unexpected downtime and economic loss. In this paper, ensemble machine learning techniques are demonstrated for the detection of different AFB faults. Initially, statistical features were extracted from temporal vibratio...
Publication date: 2011